Rename _isQUOTEMETA to isQUOTEMETA_ #23555

Merged
merged 1 commit into from
Aug 12, 2025

Conversation

khwilliamson
Contributor

The former is undefined behavior in C in some situations.

  • This set of changes does not require a perldelta entry.

@khwilliamson khwilliamson changed the title from "Rename isQUOTEMETA to isQUOTEMETA_" to "Rename _isQUOTEMETA to isQUOTEMETA_" on Aug 10, 2025
@bulk88
Contributor

bulk88 commented Aug 10, 2025

Dumb question, since I've never seen a written (.pod) C code style document for the Perl VM: what suffix or prefix should be used for an exported, linker-visible C function that is 100% private, both from CPAN and from code built under #define PERL_CORE? My commits always stick a _p suffix on the end of the symbol when the C function is getting a change that is invisible to CPAN and invisible to P5P, basically an ABI tweak that doesn't affect the VM's ASCII source code. Perl_tmps_grow_p was one of my renamings, for example. But I have seen a bunch of _x suffixes pop up over the last 10 years.

95% of other FOSS C projects prefix or suffix the C linker symbol with _ when they want to kill the linker symbol, to prevent C linker accidents with ancient .a/.lib files or accidents with deployed ancient dlsym/DLL loaders.

It would be nice to know whether Perl API best practice is _x, _p or _ for a C linker symbol ABI change. x could mean the proto has a my_perl var, or x could mean "Extended", just like how all MS APIs used Ex in the 2000/XP era, ExEx in the Win8/10 era, and now ExEx2 in the Win 11 era. But Perl 5 traditionally uses the _flags suffix for all "Extended" calls.

Nothing in perlhacktips, perlintern or perlguts gives any guidance on best practices.

The chance of #define _isQUOTEMETA or an _isQUOTEMETA linker symbol colliding with other software is very low; the name violates both Microsoft and Unix naming conventions. In very rare i386 Win32 cases, per the i386 ABI/C linker name mangling rules (now removed for the Win64 ABI), the C linker symbol/.i file token isQUOTEMETA mangles to _isQUOTEMETA if you look inside the hex dump of a .o/.obj.
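As a rough illustration of the name mangling mentioned above, here is a minimal sketch that mimics the conventional 32-bit x86 decoration of C symbols as documented for MSVC: a leading underscore for __cdecl, `_name@bytes` for __stdcall, and `@name@bytes` for __fastcall. The helper `decorate`, the `conv_t` enum and the `arg_bytes` parameter are hypothetical names made up for this example; a real toolchain does this internally, and 64-bit Windows does not decorate C names at all.

```c
#include <assert.h>
#include <stdio.h>

/* Calling conventions covered by this sketch (32-bit x86, MSVC-style C decoration). */
typedef enum { CONV_CDECL, CONV_STDCALL, CONV_FASTCALL } conv_t;

/*
 * Hypothetical helper: write the decorated linker name of a C function
 * into out. arg_bytes is the total size of the arguments in bytes; it is
 * ignored for cdecl. Returns whatever snprintf() returns.
 */
static int decorate(const char *name, conv_t conv, unsigned arg_bytes,
                    char *out, size_t size)
{
    assert(name != NULL && out != NULL);
    switch (conv) {
    case CONV_STDCALL:
        return snprintf(out, size, "_%s@%u", name, arg_bytes);
    case CONV_FASTCALL:
        return snprintf(out, size, "@%s@%u", name, arg_bytes);
    case CONV_CDECL:
    default:
        return snprintf(out, size, "_%s", name);
    }
}
```

Under these assumed rules, calling `decorate("isQUOTEMETA", CONV_CDECL, 0, buf, sizeof buf)` fills `buf` with `_isQUOTEMETA`, which is the collision scenario described in the comment.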

If you mess around with the __stdcall, __fastcall and __thiscall tokens, you can declare, define, and add a body to a function whose C linker symbol/.i file token is _isQUOTEMETA, BUT when you grep the hex dump of the .o/.obj file, _isQUOTEMETA DID NOT become __isQUOTEMETA; its name remained untouched as _isQUOTEMETA inside the C linker process/SQL DB/binary stitching phase.

This ^^^^^ caused actual SEGVs in WinPerl until 5.13 when I added the DATA token to the .def file on this line.

https://github.com/Perl/perl5/blob/blead/makedef.pl#L1261

There is (or was, in 2025) "production CPAN module code" that demands ISO C compliance for the unprototyped C functions feature!

Who needs #include "EXTERN.h" \n #include "perl.h" \n??? FOOLs!!!

Why should my PC waste an extra 30 seconds or 2 minutes compiling P5P's BLOATWARE .h files?

None of that is needed to make a proper .so file that Dynaloader.pm can successfully load (in 2005-2015 era C compilers).

The result of messing with __stdcall and __fastcall, or of a .so or .dll full of XSUBs from TUs that never did #include "perl.h", is instant SEGVs on Win32, since Perl's exported char PL_Yes[2] symbol suddenly matched the signature of an executable C function called

extern __noreturn OP* PL_Yes(int segv_per_second_setting);

extern __noreturn OP* PL_Yes(int segv_per_second_setting); is a reference to an inside joke of Perl 5.000 not alpha-5.43's public C API; I wonder who will get it. Note the hint "5.000 not alpha", so it's obviously some design mistake in the 5.000 initial commit that Larry made and immediately corrected in less than 12 months. The mistake is still visible in 5.43.

I think the 5.10 or 5.12 Win32 "DATA" token crashes happened because die() and warn() used to have linker names of "die" and "warn", and someone had an `extern const char die[] = "fatal error: %d";` line in their .xs file.

I don't think this category of SEGVs/ABI flaws with _ __ and ___ exists anymore on Perl > 5.30 releases. Those OSes and their CCs were removed from the repo.

P5P's visibility("hidden") token on POSIX Perl did a nuclear bomb drop on ancient unmaintained CPANTesters always or mostly failing CPAN tarballs that did crazy stuff like what I described above. Not 1 bug ticket with RT was ever filed after the visibility("hidden") bomb drop.

@bulk88
Contributor

bulk88 commented Aug 10, 2025

I don't see any problem with the name change itself. I think the _ on the right looks better. My only trivial concern is that the commit message claims a dire technical engineering reason that nobody has reproduced since the early 1980s, while in fact this is a code readability/style/prettiness/future maintenance improvement. It's not an engineering problem. My ABI discussions above were about how intentionally normal, non-insane code (freelancer temp gig software projects) was able, once in a blue moon, to trigger the 0 _1 __2 or ___3 linker phase/TU/C lexer phase disasters, and it always involved never reading the man page for the compiler on your hard drive.

I made a POC branch some months ago, not abusing but using this 0 _1 __2 or ___3 linker phase/parser phase confusion, to redesign WinPerl's horrifically slow Perl_get_context() function into the pthread_t mainline Perl_get_context() symbol (it is a PLT/GOT function that is 5 CPU instructions long; WinPerl's has been 450 CPU instructions from 1996 to 2025), or straight up make the TLS data a 100% legit C lang semi-global data variable, using secret GAS in a secret .h file as done by certain builds of perl by certain entities for certain linux distros.

The good use of this ABI C linker insanity is to completely optimize out the Win32 symbol table rules (cough cough, 3 memory reads through the PLT/GOT), so CPAN .dlls can access perl5xx.dll's TLS offset, a true global variable backed by .data storage inside perl5xx.dll, without using the Win32 symbol table at all, and without declaring a secret global static void * variable inserted by perl.h or EUPXS that gets secretly initialized exactly once, when void boot__My__DynaLoad__Module_BOOT(my_perl,cv); executes.

#ifdef PERL_CORE

extern __declspec(dllexport) void * PL_thr_key;

#else

extern void __cdecl PL_thr_key(void);
/* PL_thr_key is exactly 1 memory read away now in machine code, between 2 separate .dll files
    without punitive  mmap COW breaking RSS "relocation"  for having an absolute
    non-PIC C lang pointer in .rdata or .data. Not available on 20th century OSes like
   Linux and ELF. Apple and Google call this "tab discarding".

  Wait a minute, why doesn't this work on POSIX Perl anymore https://perldoc.perl.org/functions/dump ?
*/
PUSHs(Perl_sv_2mortal( ((interpreter*)PL_thr_key), sv));

#endif

Sadly this optimization discriminates against either WinPerl on AMD64 builds or WinPerl on i386 builds, because of the ABI/C name mangling rules that differ between the 2 CPUs. I forgot which. I need to insert a lower 7-bit ASCII "@" character into PL_thr_key to pull this off on the other CPU arch. That means the Perl P5P repo would get its first 4-line-long .S file in history.

@bulk88
Contributor

bulk88 commented Aug 10, 2025

TLDR: 20 or 25 years ago NIS, JHI or Nick Clark immunized the libperl.so/perl5xx.dll file for the next 1000 years against the curse of a leading _ on the left side of an identifier in C lang, by putting a "PL_" or "Perl_" prefix on the entire public and non-public undocumented extern "C" Perl C API, and then doing some magic tricks with the C preprocessor.
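The prefix-plus-preprocessor approach described above can be sketched in a few lines. The names `Demo_add` and `add` are hypothetical and exist only for this example; Perl's generated headers map short API names to `Perl_`-prefixed functions in a broadly similar way, but with extra details (such as the interpreter context argument) that are left out here.

```c
#include <assert.h>

/* Hypothetical library function: the exported linker symbol carries a
 * project prefix, so it cannot clash with a bare "add" from elsewhere. */
int Demo_add(int a, int b)
{
    return a + b;
}

/* Short name for callers. The preprocessor rewrites it to the prefixed
 * symbol before compilation, so the linker never sees a symbol named "add". */
#define add(a, b) Demo_add((a), (b))
```

Source code keeps using the short spelling, `add(2, 3)`, while the object file only references `Demo_add`.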

@khwilliamson
Contributor Author

It is not dire, and it appears you misunderstand why it is being done. The C standard says that identifiers beginning with an underscore are reserved for the C implementation's use at file scope, and that those beginning with an underscore followed by an uppercase letter or a second underscore are reserved everywhere (Section 7.1.3 of C99). The blead name violates this. The compiler could croak on this, but none I'm aware of do. It's not a problem unless we stumble upon an identifier that they actually use. So we are gradually converting the illegal ones we have invented to legal ones. That's all this commit does. It has nothing to do with linking.

@tonycoz
Contributor

tonycoz commented Aug 11, 2025

Why have an underscore at all?

@khwilliamson
Contributor Author

To make someone think twice about using it. Being undocumented hasn't stopped people in the past from using things, and then they put up a stink should it change.

I don't want to have to support this publicly, so the underscore is a bit of extra insurance.

@tonycoz
Contributor

tonycoz commented Aug 11, 2025

I don't want to have to support this publicly, so the underscore is a bit of extra insurance

Why not just guard it with PERL_CORE?

The former is undefined behavior in C in some situations.
The rules are detailed in "Choosing legal symbol names" in perlhacktips

This commit omits the underscore and makes the macro available only to
the core and extensions.  I think I didn't know about PERL_CORE when I
created this macro, so used the leading underscore to discourage its
use from anyone who stumbled upon its existence.  The reason not to make
it public was my uncertainty about if it was the correct thing to do;
my not expecting that it would be useful outside of core; and the
extra work needed to make it a polished interface.  I think now that if
someone requested it be made public, that could safely be done.
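A minimal sketch of the guarding approach the commit message describes, assuming the usual `PERL_CORE` / `PERL_EXT` preprocessor checks for "core and extensions". The function `demo_is_quotemeta`, its placeholder rule, and `quotemeta_is_visible` are hypothetical and are not Perl's real definition; the `#define PERL_CORE 1` line only simulates compiling a core source file.

```c
#include <assert.h>
#include <ctype.h>

/* Simulate a core translation unit. In a real build this would come from
 * the build system; remove it to see the macro disappear. */
#define PERL_CORE 1

/* Placeholder rule, NOT Perl's real one: ASCII characters that are
 * neither alphanumeric nor '_'. */
static int demo_is_quotemeta(int c)
{
    return c >= 0 && c < 128 && !(isalnum(c) || c == '_');
}

/* The macro has no leading underscore; it is simply not defined for code
 * outside the core and its extensions. */
#if defined(PERL_CORE) || defined(PERL_EXT)
#  define isQUOTEMETA(c) demo_is_quotemeta(c)
#endif

/* Reports whether the macro is visible in this translation unit. */
static int quotemeta_is_visible(void)
{
#ifdef isQUOTEMETA
    return 1;
#else
    return 0;
#endif
}
```

The design point is that visibility is controlled by the guard rather than by an awkward name, so the identifier itself stays within what the C standard leaves to user code.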
@khwilliamson
Contributor Author

Changed as @tonycoz suggested

@khwilliamson khwilliamson merged commit d6de6d7 into Perl:blead Aug 12, 2025
33 checks passed